XRegExp 3.2.0
XRegExp provides augmented (and extensible) JavaScript regular expressions. You get modern syntax and flags beyond what browsers support natively. XRegExp is also a regex utility belt with tools to make your grepping and parsing easier, while freeing you from regex cross-browser inconsistencies and other annoyances.
XRegExp supports all native ES6 regular expression syntax. It supports Internet Explorer 5.5+, Firefox 1.5+, Chrome, Safari 3+, and Opera 11+. You can use it with Node.js or as a RequireJS module.
Performance
XRegExp compiles to native RegExp objects. Therefore regexes built with XRegExp perform just as fast as native regular expressions. There is a tiny extra cost when compiling a pattern for the first time.
Usage examples
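// Using named capture and flag x (free-spacing and line comments)
var date = XRegExp(`(?<year>  [0-9]{4} ) -?  # year
                    (?<month> [0-9]{2} ) -?  # month
                    (?<day>   [0-9]{2} )     # day`, 'x');
// XRegExp.exec gives you named backreferences on the match result
var match = XRegExp.exec('2017-02-22', date);
match.year; // -> '2017'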
// XRegExp.exec also accepts optional pos and sticky arguments
var pos = 3;
var result = [];
while (match = XRegExp.exec('<1><2><3><4>5<6>', /<(\d+)>/, pos, 'sticky')) {
    result.push(match[1]);
    pos = match.index + match[0].length;
}
// result -> ['2', '3', '4']
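// XRegExp.replace allows named backreferences in replacement strings
XRegExp.replace('2017-02-22', date, '${month}/${day}/${year}'); // -> '02/22/2017'
// Replacement functions receive the match with named captures as properties
XRegExp.replace('2017-02-22', date, (match) => {
    return match.month + '/' + match.day + '/' + match.year;
}); // -> '02/22/2017'
// Regexes built with XRegExp are native RegExp objects and work with native methods
date.test('2017-02-22'); // -> true
// With native methods, named captures must be referenced by number
'2017-02-22'.replace(date, '$2/$3/$1'); // -> '02/22/2017'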
// Use XRegExp.forEach to extract every other digit from a string
var evens = [];
XRegExp.forEach('1a2345', /\d/, (match, i) => {
    if (i % 2) evens.push(+match[0]);
});
// evens -> [2, 4]
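// Use XRegExp.matchChain to get numbers within <b> tags
XRegExp.matchChain('1 <b>2</b> 3 <B>4 \n 56</B>', [
    XRegExp('(?is)<b>.*?</b>'),
    /\d+/
]); // -> ['2', '4', '56']
// You can also pass forward and return specific backreferences
var html = '<a href="http://xregexp.com/">XRegExp</a>' +
           '<a href="http://www.google.com/">Google</a>';
XRegExp.matchChain(html, [
    {regex: /<a href="([^"]+)">/i, backref: 1},
    {regex: XRegExp('(?i)^https?://(?<domain>[^/?#]+)'), backref: 'domain'}
]); // -> ['xregexp.com', 'www.google.com']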
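// Merge strings and regexes into a single pattern, with updated backreferences
XRegExp.union(['a+b*c', /(dog)\1/, /(cat)\1/], 'i', {conjunction: 'or'}); // -> /a\+b\*c|(dog)\1|(cat)\2/i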
These examples give the flavor of what's possible, but XRegExp has more syntax, flags, methods, options, and browser fixes that aren't shown here. You can also augment XRegExp's regular expression syntax with addons (see below) or write your own. See xregexp.com for details.
Addons
You can either load addons individually, or bundle all addons with XRegExp by loading xregexp-all.js.
Unicode
If not using xregexp-all.js, first include the Unicode Base script and then one or more of the addons for Unicode blocks, categories, properties, or scripts.
Then you can do this:
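// Test the Unicode category L (Letter)
var unicodeWord = XRegExp('^\\pL+$');
unicodeWord.test('Русский'); // -> true
unicodeWord.test('日本語'); // -> true
unicodeWord.test('العربية'); // -> true
// Test some Unicode scripts
XRegExp('^\\p{Hiragana}+$').test('ひらがな'); // -> true
XRegExp('^[\\p{Latin}\\p{Common}]+$').test('Über Café.'); // -> true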
By default, \p{…} and \P{…} support the Basic Multilingual Plane (i.e. code points up to U+FFFF). You can opt in to full 21-bit Unicode support (with code points up to U+10FFFF) on a per-regex basis by using flag A. This is called astral mode. You can automatically add flag A for all new regexes by running XRegExp.install('astral'). When in astral mode, \p{…} and \P{…} always match a full code point rather than a code unit, using surrogate pairs for code points above U+FFFF.
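// Using flag A to match astral code points
XRegExp('^\\pS$').test('💩'); // -> false
XRegExp('^\\pS$', 'A').test('💩'); // -> true
XRegExp('(?A)^\\pS$').test('💩'); // -> true
// Using surrogate pair U+D83D U+DCA9 to represent U+1F4A9 (pile of poo)
XRegExp('(?A)^\\pS$').test('\uD83D\uDCA9'); // -> true
// Implicit flag A
XRegExp.install('astral');
XRegExp('^\\pS$').test('💩'); // -> true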
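To make the difference between a code unit and a code point concrete, here is a minimal sketch in plain JavaScript that does not use XRegExp at all. It relies only on built-in string methods and the native u flag, which is shown purely for comparison and is not meant to describe how XRegExp implements astral mode. The variable names are illustrative.

```javascript
// U+1F4A9 lies above U+FFFF, so a JavaScript string stores it as two
// 16-bit code units (a surrogate pair) that together form one code point.
const pileOfPoo = '💩';

const codeUnitCount = pileOfPoo.length;                      // 2
const highSurrogate = pileOfPoo.charCodeAt(0).toString(16);  // 'd83d'
const lowSurrogate = pileOfPoo.charCodeAt(1).toString(16);   // 'dca9'
const codePoint = pileOfPoo.codePointAt(0).toString(16);     // '1f4a9'

// The escaped surrogate pair is the same string as the literal character.
const sameString = pileOfPoo === '\uD83D\uDCA9';             // true

// Without the native u flag, "." consumes a single code unit, so it cannot
// span the whole character. With the u flag, "." consumes a full code point.
const matchesPerCodeUnit = /^.$/.test(pileOfPoo);            // false
const matchesPerCodePoint = /^.$/u.test(pileOfPoo);          // true

console.log(codeUnitCount, highSurrogate, lowSurrogate, codePoint);
console.log(sameString, matchesPerCodeUnit, matchesPerCodePoint);
```

This is the same reason the first XRegExp example above returns false without flag A: a pattern that works one code unit at a time sees two units between ^ and $.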
Opting in to astral mode disables the use of \p{…} and \P{…} within character classes. In astral mode, use e.g. (\pL|[0-9_])+ instead of [\pL0-9_]+.
XRegExp uses Unicode 9.0.0.
XRegExp.build
Build regular expressions using named subpatterns, for readability and pattern reuse:
// The inner pattern contains insignificant whitespace, so it is also built with flag x
var time = XRegExp.build('(?x)^ {{hours}} ({{minutes}}) $', {
    hours: XRegExp.build('{{h12}} : | {{h24}}', {
        h12: /1[0-2]|0?[1-9]/,
        h24: /2[0-3]|[01][0-9]/
    }, 'x'),
    minutes: /^[0-5][0-9]$/
});
time.test('10:59'); // -> true
XRegExp.exec('10:59', time).minutes; // -> '59'
Named subpatterns can be provided as strings or regex objects. A leading ^ and trailing unescaped $ are stripped from subpatterns if both are present, which allows embedding independently-useful anchored patterns. {{…}} tokens can be quantified as a single unit. Any backreferences in the outer pattern or provided subpatterns are automatically renumbered to work correctly within the larger combined pattern. The syntax ({{name}}) works as shorthand for named capture via (?<name>{{name}}). Named subpatterns cannot be embedded within character classes.
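The anchor-stripping rule is easier to see in isolation. The following is a rough sketch in plain JavaScript of the idea only, not XRegExp's actual implementation: stripAnchors is a hypothetical helper, the pattern is simplified from the example above, and the sketch ignores backreference renumbering and the other features XRegExp.build provides. It assumes an engine with native named capture groups (ES2018).

```javascript
// Hypothetical helper, not part of XRegExp: removes a leading ^ and a trailing
// unescaped $ from a pattern source, but only when both are present.
function stripAnchors(source) {
    const hasLeadingCaret = source.startsWith('^');
    // A trailing $ is unescaped when an even number of backslashes precedes it.
    const hasTrailingDollar = /(?:^|[^\\])(?:\\\\)*\$$/.test(source);
    return hasLeadingCaret && hasTrailingDollar ? source.slice(1, -1) : source;
}

// An anchored pattern that is useful on its own...
const minutes = /^[0-5][0-9]$/;
const hours = /1[0-2]|0?[1-9]/;

// ...can be embedded in a larger pattern once its anchors are removed.
// Left in place, the inner ^ and $ would prevent any match after the colon.
const time = new RegExp(
    '^(?:' + stripAnchors(hours.source) + '):' +
    '(?<minutes>' + stripAnchors(minutes.source) + ')$'
);

const isValid = time.test('10:59');                         // true
const capturedMinutes = time.exec('10:59').groups.minutes;  // '59'
console.log(isValid, capturedMinutes);
```

Wrapping each subpattern in a non-capturing or named group, as done here by hand, is also what lets an embedded subpattern be quantified as a single unit.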
See also: Creating Grammatical Regexes Using XRegExp.build.
XRegExp.matchRecursive
Match recursive constructs using XRegExp pattern strings as left and right delimiters:
// Basic usage
var str = '(t((e))s)t()(ing)';
XRegExp.matchRecursive(str, '\\(', '\\)', 'g');
// -> ['t((e))s', '', 'ing']
// Extended information mode with valueNames
str = 'Here is <div> <div>an</div></div> example';
XRegExp.matchRecursive(str, '<div\\s*>', '</div>', 'gi', {
    valueNames: ['between', 'left', 'match', 'right']
});
// Omitting unneeded parts with null valueNames, and using escapeChar
str = '...{1}.\\{{function(x,y){return {y:x}}}';
XRegExp.matchRecursive(str, '{', '}', 'g', {
    valueNames: ['literal', null, 'value', null],
    escapeChar: '\\'
});
// Sticky mode via flag y
str = '<1><<<2>>><3>4<5>';
XRegExp.matchRecursive(str, '<', '>', 'gy');
// -> ['1', '<<2>>', '3']
XRegExp.matchRecursive throws an error if it scans past an unbalanced delimiter in the target string.
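For intuition about what balanced matching involves, here is a simplified sketch in plain JavaScript that tracks nesting depth with a counter. The matchBalanced function is hypothetical and written only for illustration: it handles single-character delimiters, whereas XRegExp.matchRecursive accepts regex pattern strings as delimiters and supports flags, valueNames, and escapeChar. It should not be read as the library's implementation.

```javascript
// Hypothetical helper for illustration only. Returns the text found between
// each outermost pair of single-character delimiters.
function matchBalanced(str, open, close) {
    const results = [];
    let depth = 0;   // number of currently open delimiters
    let start = -1;  // index just after the outermost opening delimiter

    for (let i = 0; i < str.length; i++) {
        if (str[i] === open) {
            if (depth === 0) start = i + 1;
            depth++;
        } else if (str[i] === close) {
            if (depth === 0) {
                throw new Error('Unbalanced closing delimiter at index ' + i);
            }
            depth--;
            // Back at depth 0 means an outermost pair has just been closed.
            if (depth === 0) results.push(str.slice(start, i));
        }
    }
    if (depth !== 0) {
        throw new Error('Unbalanced opening delimiter at index ' + (start - 1));
    }
    return results;
}

const balanced = matchBalanced('(t((e))s)t()(ing)', '(', ')');
console.log(balanced); // ['t((e))s', '', 'ing']
```

A single regular expression without recursion support cannot express this kind of counting for arbitrary nesting depth, which is why a dedicated scanning function is used for it.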
Installation and usage
In browsers (bundle XRegExp with all of its addons):
<script src="xregexp-all.js"></script>
Using npm:
npm install xregexp
In Node.js:
var XRegExp = require('xregexp');
In an AMD loader like RequireJS:
require({paths: {xregexp: 'xregexp-all'}}, ['xregexp'], (XRegExp) => {
    console.log(XRegExp.version);
});
About
XRegExp copyright 2007-2017 by Steven Levithan. Unicode data generators by Mathias Bynens, adapted from unicode-data. XRegExp's syntax extensions and flags come from Perl, .NET, etc.
All code, including addons, tools, and tests, is released under the terms of the MIT License.
Learn more at xregexp.com.